6 research outputs found

    Analysis and Querying of Health-Related Social Media

    The increased popularity of social media and the copious amount of user-generated data in the last few years have impacted various aspects of individuals’ lives. The use of social media for health care related purposes, which is the focus of this thesis, has increased exponentially. This provides researchers with a massive volume of data that can augment traditional health-related data sources (such as electronic medical records) if properly mined and analyzed. Despite advances in text analytics, this data is challenging to analyze because of its specialized vocabulary, the data collection process, and its missing values.
    In this thesis, we focus on two research directions: (a) analyzing the demographics of users who participate in health-related social media, along with their posted content across a wide range of sources, and highlighting specific health issues reported by users; and (b) effectively querying health-related social media or other health-related documents, which generalizes to the problem of querying annotated documents. Specifically, in our first contribution, we study the demographics of users who participate in health-related social media to identify possible links to health care disparities. Using these demographics, our second contribution analyzes the content of posts grouped by demographic segment, applying information extraction methods to extract medical concepts, identify the top distinctive terms, and measure sentiment and emotion. Our third contribution extends this content analysis by studying the intent of posts generated by users across different data sources. Lastly, we focus on a specific domain, electronic cigarettes, and analyze the health-related effects reported by online users.
    In the second direction of this thesis, we developed a query framework to help users efficiently explore health-related data, present in either online social media or other medical documents, by exploiting the relationships between the network users or between the concepts inside the documents. Our solution generalizes to other domains with similar properties, such as general-purpose social networks. We refer to this problem as keyword querying on graph-annotated documents, where we query documents annotated by interconnected entities, which are related to each other through association graphs. Our novel framework balances the importance of text relevance and semantic relevance through the graph

    Querying Documents Annotated by Interconnected Entities

    In a large number of applications, from biomedical literature to social networks, there are collections of text documents that are annotated by interconnected entities, which are related to each other through association graphs. For example, social posts are related through the friendship graph of their authors, and PubMed articles are annotated by MeSH terms, which are related through ontological relationships. To effectively query such collections, the semantic distance between the entities of a document and the query must be taken into account, in addition to the text content relevance of the document. In this paper, we propose a novel query framework, which we refer to as keyword querying on graph-annotated documents, and query techniques to answer such queries. Our methods automatically balance the impact of the graph entities and the text content in the ranking. Our qualitative evaluation on real data shows that our methods improve the ranking quality compared to baseline ranking systems.
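
    The idea of combining text relevance with graph-based semantic distance can be illustrated with a minimal sketch. This is not the authors' method: the abstract says their techniques balance the two signals automatically, whereas the sketch below assumes a fixed, hand-picked weight `alpha`, a simple term-overlap text score, and a semantic score of 1 / (1 + d), where d is the shortest-path distance in the association graph. All function names, the scoring formulas, and the toy data are hypothetical.

```python
from collections import deque


def graph_distance(graph, source, target):
    """Unweighted shortest-path length found by BFS; None if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


def text_score(query_terms, text):
    """Fraction of query terms that occur in the document text (assumed measure)."""
    if not query_terms:
        return 0.0
    words = set(text.lower().split())
    return sum(term.lower() in words for term in query_terms) / len(query_terms)


def semantic_score(graph, query_entity, doc_entities):
    """Closeness of the nearest document entity: 1 / (1 + distance), 0 if none is reachable."""
    best = 0.0
    for entity in doc_entities:
        dist = graph_distance(graph, query_entity, entity)
        if dist is not None:
            best = max(best, 1.0 / (1.0 + dist))
    return best


def rank(docs, graph, query_terms, query_entity, alpha=0.5):
    """Rank documents by a fixed linear mix of text and semantic scores.

    alpha is a hand-picked assumption here, not an automatically tuned balance.
    """
    scored = []
    for doc_id, (text, entities) in docs.items():
        score = (alpha * text_score(query_terms, text)
                 + (1 - alpha) * semantic_score(graph, query_entity, entities))
        scored.append((doc_id, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy association graph over concepts (hypothetical data).
graph = {
    "asthma": ["lung disease"],
    "lung disease": ["asthma", "copd"],
    "copd": ["lung disease"],
}

# Each document: (text, annotating entities).
docs = {
    "d1": ("inhaler helps my breathing", ["asthma"]),
    "d2": ("inhaler side effects", ["copd"]),
    "d3": ("knee pain after running", ["arthritis"]),
}

for doc_id, score in rank(docs, graph, ["inhaler"], "asthma"):
    print(doc_id, round(score, 3))
```

    In this toy setup, d1 and d2 match the keyword equally well, so the graph term alone separates them: d1 is annotated with the query entity itself, while d2's entity is two hops away.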